Linear Contour Learning: A Method for Supervised Dimension Reduction

Authors

  • Bing Li
  • Hongyuan Zha
  • Francesca Chiaromonte
Abstract

We propose a novel approach to sufficient dimension reduction in regression, based on estimating contour directions of negligible variation for the response surface. These directions span the orthogonal complement of the minimal space relevant for the regression, and can be extracted according to a measure of the variation in the response, leading to General Contour Regression (GCR). In comparison to existing sufficient dimension reduction techniques, this contour-based methodology guarantees exhaustive estimation of the central space under ellipticity of the predictor distribution and very mild additional assumptions, while maintaining √n-consistency and computational ease. Moreover, it proves to be robust to departures from ellipticity. We also establish some useful population properties for GCR. Simulations to compare performance with that of standard techniques such as ordinary least squares, sliced inverse regression, principal Hessian directions, and sliced average variance estimation confirm the advantages anticipated by theoretical analyses. We also demonstrate the use of contour-based methods on a data set concerning grades of students from Massachusetts colleges.

Introduction and Background

… unsupervised approaches; here we consider dimension reduction for the regression of a continuous response Y on a vector of continuous predictors X = (X_1, …, X_p)^T ∈ ℝ^p. Our approach is based on sufficient dimension reduction, a body of statistical theory and methods for reducing the dimension of X while preserving information on the regression; that is, on the conditional distribution of Y|X. A dimension reduction subspace (Cook, 1998) is defined as the column span of any p × d (d < p) matrix η such that …
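To make the contour idea concrete, here is a minimal Python sketch in the spirit of simple contour regression: pairs of observations whose responses differ by less than a cutoff c have predictor differences that lie roughly along contours of the response surface, so after standardizing X, the eigenvectors of the averaged outer-product matrix with the smallest eigenvalues estimate the central-space directions. The function name, the cutoff value, and the test data are illustrative assumptions, not the paper's exact GCR estimator.

```python
import numpy as np

def simple_contour_regression(X, y, d, c):
    """Sketch of simple contour regression (illustrative, not the paper's GCR).

    Pairs with |y_i - y_j| <= c give predictor differences that lie roughly
    along response contours; the d directions LEAST represented among them
    span the estimated central subspace.
    """
    n, p = X.shape
    # Standardize predictors: Z = Sigma^{-1/2} (X - mean)
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_half = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ Sigma_inv_half

    # Average outer products of differences over near-contour pairs
    A = np.zeros((p, p))
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if abs(y[i] - y[j]) <= c:
                diff = Z[i] - Z[j]
                A += np.outer(diff, diff)
                count += 1
    A /= max(count, 1)

    # Eigenvectors with the SMALLEST eigenvalues of A span the standardized
    # central subspace; map the directions back to the original X scale.
    w, V = np.linalg.eigh(A)          # eigenvalues in ascending order
    B = Sigma_inv_half @ V[:, :d]
    return B / np.linalg.norm(B, axis=0)
```

On a toy model y = x1 + noise, the single recovered direction should align closely with the first coordinate axis, since the response varies only along x1.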


Similar Resources

Gradient-based kernel dimension reduction for supervised learning

This paper proposes a novel kernel approach to linear dimension reduction for supervised learning. The purpose of the dimension reduction is to find directions in the input space to explain the output as effectively as possible. The proposed method uses an estimator for the gradient of regression function, based on the covariance operators on reproducing kernel Hilbert spaces. In comparison wit...


Constructing Interactive Visual Classification, Clustering and Dimension Reduction Models for n-D Data

The exploration of multidimensional datasets of all possible sizes and dimensions is a long-standing challenge in knowledge discovery, machine learning, and visualization. While multiple efficient visualization methods for n-D data analysis exist, the loss of information, occlusion, and clutter continue to be a challenge. This paper proposes and explores a new interactive method for visual disc...


A supervised probabilistic principal component analysis mixture model in a lossless dimension reduction framework for face recognition

In this paper, we first proposed the supervised version of probabilistic principal component analysis mixture model. Then, we consider a learning predictive model with projection penalties, as an approach for dimensionality reduction without loss of information for face recognition. In the proposed method, first a local linear underlying manifold of data samples is obtained using the supervised...


Analysis of Correlation Based Dimension Reduction Methods

Dimension reduction is an important topic in data mining and machine learning. Especially dimension reduction combined with feature fusion is an effective preprocessing step when the data are described by multiple feature sets. Canonical Correlation Analysis (CCA) and Discriminative Canonical Correlation Analysis (DCCA) are feature fusion methods based on correlation. However, they are differen...
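The classical CCA on which that comparison builds can be sketched in a few lines via the SVD of the whitened cross-covariance matrix. This is an illustrative plain-CCA implementation, not the DCCA variant the paper discusses; the function name and test data are assumptions.

```python
import numpy as np

def cca(X, Y, d):
    """Classical CCA: d direction pairs maximizing correlation between
    projections of X and Y, via the SVD of the whitened cross-covariance."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse symmetric square root via the eigendecomposition
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    K = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(K)
    A = inv_sqrt(Sxx) @ U[:, :d]   # canonical directions for X
    B = inv_sqrt(Syy) @ Vt[:d].T   # canonical directions for Y
    return A, B, s[:d]             # s holds the canonical correlations
```

When the two views share a low-noise linear relationship, the leading canonical correlations approach one, which is the signal feature-fusion methods exploit.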


Semi-supervised learning with Gaussian fields

Gaussian fields (GF) have recently received considerable attention for dimension reduction and semi-supervised classification. This paper presents two contributions. First, we show how the GF framework can be used for regression tasks on high-dimensional data. We consider an active learning strategy based on entropy minimization and a maximum likelihood model selection method. Second, we show h...




Journal:

Volume   Issue

Pages  -

Publication date: 2004